167 research outputs found

    A robust illumination-invariant face recognition based on fusion of thermal IR, maximum filter and visible image

    Face recognition faces many challenges in real-life detection, where consistently accurate recognition is almost impossible to maintain. Even well-established state-of-the-art algorithms produce low recognition accuracy under poor lighting. To make face recognition more robust and illumination invariant, this paper proposes an algorithm that uses a triple fusion approach. We implement a hybrid method that combines an active approach, thermal infrared imaging, with the passive approaches of the Maximum Filter and the visible image. These approaches allow us to improve image pre-processing as well as feature extraction and face detection, even when a person’s face image is captured in total darkness. In our experiments, the Extended Yale B database is tested with the Maximum Filter and compared against other state-of-the-art filters. We conducted several experiments on mid-wave and long-wave thermal infrared performance during pre-processing and found that it can improve recognition beyond what meets the eye. We also found that PCA eigenfaces cannot be produced under poor illumination. Mid-wave thermal imaging captures the heat signature of the body, and the Maximum Filter maintains the fine edges that are readily used by classifiers such as SVM, OpenCV or even kNN with Euclidean distance to perform face recognition. These configurations have been assembled into a portable, robust face recognition system, and the results show that fusing these illumination-invariant processed images during pre-processing gives far better results than using the visible image, thermal image or maximum-filtered image separately.
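
As a rough illustration of the maximum filter named in this abstract, the sketch below replaces each pixel with the largest value in its neighbourhood. The function name `max_filter`, the 3x3 default window, and the list-of-lists grayscale image are assumptions made for the example; the abstract does not give the authors' window size or implementation, and the fusion with thermal images is not shown.

```python
def max_filter(image, radius=1):
    """Replace each pixel with the maximum of its (2*radius+1)^2 neighbourhood.

    `image` is a grayscale image given as a list of equal-length rows.
    Border pixels use only the neighbours that fall inside the image.
    (Hypothetical helper, not the authors' code.)
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        # Clamp the vertical window to the image bounds.
        y0, y1 = max(0, y - radius), min(height, y + radius + 1)
        row = []
        for x in range(width):
            # Clamp the horizontal window to the image bounds.
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            row.append(max(image[j][i]
                           for j in range(y0, y1)
                           for i in range(x0, x1)))
        result.append(row)
    return result
```

For example, `max_filter([[1, 2, 3], [4, 5, 6], [7, 8, 9]])` gives `[[5, 6, 6], [8, 9, 9], [8, 9, 9]]`.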

    Tree-based mining contrast subspace

    All existing contrast subspace mining methods employ a density-based likelihood contrast scoring function to measure the likelihood of a query object belonging to a target class against another class in a subspace. However, density tends to decrease as the dimensionality of subspaces increases, which causes its bounds to identify inaccurate contrast subspaces for the given query object. This paper proposes a novel contrast subspace mining method that employs a tree-based likelihood contrast scoring function, which is not affected by the dimensionality of subspaces. The tree-based scoring measure recursively binary-partitions the subspace so that objects belonging to the target class are grouped together and separated from objects belonging to the other class. In a contrast subspace, the query object should fall in a group containing more objects of the target class than of the other class. The method incorporates a feature selection approach to find a subset of one-dimensional subspaces with high likelihood contrast scores with respect to the query object. The contrast subspaces are then searched through the selected subset of one-dimensional subspaces. An experiment is conducted to evaluate the effectiveness of the tree-based method in terms of classification accuracy. The results show that the proposed method has higher classification accuracy and outperforms the existing method on several real-world data sets.

    A performance comparison of feature extraction methods for sentiment analysis

    Sentiment analysis is the task of classifying documents according to their sentiment polarity. Before sentiment documents can be classified, plain text documents need to be transformed into data the system can work with. This step is known as feature extraction. Feature extraction produces text representations that are enriched with information in order to obtain better classification results. The experiment in this work investigates the effects of applying different sets of extracted features and discusses the behavior of the features in sentiment analysis. The feature extraction methods include unigrams, bigrams, trigrams, Part-Of-Speech (POS) and SentiWordNet methods. The unigram, POS and SentiWordNet features are word-based features, whereas bigrams and trigrams are phrase-based features. The experimental results show that phrase-based features are more effective for sentiment analysis, as the accuracies produced are much higher than those of word-based features. This might be because word-based features disregard the sentence structure and word sequence of the original text and thus distort its original meaning. Bigram and trigram features retain some of the sequence of the sentences and thus contribute to better representations of the text.
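
To make the distinction between word-based and phrase-based features concrete, here is a minimal sketch of n-gram extraction. The function name `extract_ngrams`, the lowercasing, and whitespace tokenization are assumptions for illustration; the abstract does not describe the tokenizer the authors used.

```python
from collections import Counter


def extract_ngrams(text, n):
    """Return the list of n-grams (as space-joined strings) in `text`.

    Tokenization is a plain lowercase whitespace split, which is an
    assumption of this sketch, not the authors' pre-processing.
    """
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, n):
    """Bag-of-n-grams representation: n-gram -> occurrence count."""
    return Counter(extract_ngrams(text, n))
```

With the input `"not a good movie"`, the unigrams are `["not", "a", "good", "movie"]`, while the bigrams `["not a", "a good", "good movie"]` keep some of the word order that the unigram view discards.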

    An optimized multi-layer ensemble framework for sentiment analysis

    Public opinion plays an important role in decision-making tasks in various fields. Sentiment analysis is a key task in summarizing opinions, as it classifies opinion documents into positive and negative sentiment groups. Machine learning based classification is efficient and versatile. The ensemble concept is used to improve classification accuracy by combining the decisions of multiple classifiers. In this work, a framework for sentiment analysis is designed that extends the ensemble concept to all subtasks of machine learning classification in order to achieve better analysis. There are three subtasks in machine learning based sentiment analysis: feature extraction, feature selection and classification. The ensemble concept is applied to all three subtasks by using different methods to perform each one and combining their results. Optimization is performed with a Genetic Algorithm to find the combination of methods that performs better. The proposed framework is tested on datasets from four different domains, and the sentiment analysis accuracy is shown to be very high. Future work includes testing the framework on different classification domains and with different optimization algorithms.
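
The idea of combining the decisions of multiple classifiers can be sketched with simple majority voting. Majority voting is an assumption here: the abstract does not say which combination rule the framework uses, and `majority_vote` is a hypothetical helper name.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier predictions by majority vote.

    `predictions` is a list with one entry per classifier; each entry is
    the list of labels that classifier assigned to the same documents.
    Ties are resolved in favour of the label seen first among the tied ones.
    """
    combined = []
    # zip(*predictions) groups the labels that all classifiers gave one document.
    for labels_for_doc in zip(*predictions):
        winner, _count = Counter(labels_for_doc).most_common(1)[0]
        combined.append(winner)
    return combined
```

For three classifiers that output `["pos", "neg", "pos"]`, `["pos", "pos", "neg"]` and `["neg", "neg", "pos"]`, the combined decision is `["pos", "neg", "pos"]`.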

    Ensemble classifier and resampling for imbalanced multiclass learning

    An ensemble classifier called DECIML was previously reported to perform well on benchmark data compared with several single classifiers and ensemble classifiers such as AdaBoost, Bagging and Random Forest. The ensemble was implemented with sampling in order to investigate whether this improves the classification performance of DECIML. The random sampling with replacement (SWR) method is applied to the minority classes in the imbalanced multiclass data. Results show that SWR is able to increase the average performance of the ensemble classifier.
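
A minimal sketch of random sampling with replacement applied to minority classes is shown below. It assumes that every class is topped up to the size of the largest class; the abstract does not state how many samples are drawn, and `oversample_minority` and its `seed` parameter are hypothetical names used for this example only.

```python
import random
from collections import Counter


def oversample_minority(samples, labels, seed=0):
    """Balance a multiclass dataset by sampling minority classes with replacement.

    Each class smaller than the largest class receives extra samples drawn
    at random, with replacement, from its own members. The original samples
    are kept unchanged at the front of the returned lists.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values()) if counts else 0
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, lab in zip(samples, labels) if lab == label]
        # k == 0 for the majority class, so nothing is added there.
        extra = rng.choices(pool, k=target - count)
        out_samples.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_samples, out_labels
```

Because the draws are with replacement, the same minority sample may appear several times in the balanced data.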

    Improving the identification and classification of Malaysian medicinal leaf images using ensemble method

    Malaysia has abundant natural resources, especially plants that can be used for medicinal or herbal purposes. However, little research has been done on preserving knowledge of these resources so that the community can use computing tools to identify useful medicinal plants. This paper presents the implementation of digital opportunities for Malaysian medicinal plants via leaf image identification and classification. Of late, experts in traditional medicine and herbs have become few, and the younger generation mostly lacks knowledge of the medicinal and herbal properties of the plants. Therefore, this work is important in assisting the community (rural and urban) to identify Malaysian medicinal plants and possibly share this knowledge with future generations. The focus of this paper is to prepare the identification phase before the actual system is developed. The implementation of such a system is vital in enabling the community to preserve these important resources.

    An evolutionary based features construction methods for data summarization approach

    Coral reefs are on course to become the first ecosystem that human activity will eliminate entirely from the Earth, a leading United Nations scientist claims. It is predicted that this will occur before the end of the present century, which means that there are children already born who will live to see a world without coral. Coral reefs are important for the immense biodiversity of their ecosystems. They contain a quarter of all marine species. This research addresses the question of whether a data summarization approach can be used to predict the survival of coral reefs in Malaysia by identifying their survival factors. A data summarization approach is proposed because of its capability to learn from data stored in multiple tables. Specifically, this research discusses the application of a genetic algorithm to optimize the feature construction process on the coral reef data, generating input data for the data summarization method called Dynamic Aggregation of Relational Attributes (DARA). The DARA algorithm is applied to summarize data stored in the non-target tables by clustering them into groups, where multiple records stored in non-target tables correspond to a single record stored in a target table. Feature construction methods are applied in order to improve the descriptive accuracy of the DARA algorithm. This research proposes novel feature construction methods, called Variable Length Feature Construction without Substitution (VLFCWOS) and Variable Length Feature Construction with Substitution (VLFCWS), in order to construct a set of relevant features for learning relational data. These methods are proposed to improve the descriptive accuracy of the summarized data. In the process of summarizing relational data, a genetic algorithm is also applied and several feature scoring measures are evaluated in order to find the best set of relevant constructed features.
In this work, we empirically compare the predictive accuracies of classification tasks based on the proposed feature construction methods and on existing feature construction methods. The experimental results show that classifying data summarized with the VLFCWS method, using Total Cluster Entropy combined with Information Gain (CE-IG) as the feature scoring measure, gives the best predictive accuracy in most cases.
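
As background for the feature scoring measures mentioned above, the following sketch computes Shannon entropy and the standard information gain of a discrete feature. It shows only the textbook definition; how the authors combine it with Total Cluster Entropy is not described in the abstract and is not attempted here. The function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())


def information_gain(labels, feature_values):
    """Reduction in label entropy after splitting on a discrete feature.

    `feature_values[i]` is the feature value of the record whose class
    label is `labels[i]`.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(group) / total * entropy(group)
                    for group in groups.values())
    return entropy(labels) - remainder
```

A feature that separates the classes perfectly has a gain equal to the entropy of the labels, and a feature unrelated to the classes has a gain near zero.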

    Development of a genetic-based hierarchical agglomerative clustering technique for parallel clustering of bilingual corpora based on reduced terms

    In this project, we report on our work applying Hierarchical Agglomerative Clustering (HAC) to a large corpus of documents, each of which appears in both Malay and English. We cluster these documents for each language and compare the results with respect to the content of the clusters produced. On the data available, the results of clustering one language resemble those of the other, provided the number of clusters required is relatively small. Further, we study the effects of changing the method used to compute the inter-cluster distance, which includes single link, complete link and average link distance between clusters. Finally, we describe an experiment employing a genetic algorithm to fine-tune the individual term weights in order to reproduce more closely a predefined set of clusters.
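
The three inter-cluster distances named in this abstract can be sketched as follows. Euclidean distance between points is an assumption made for the example (the abstract does not state the document distance used), and the function names are hypothetical.

```python
import math
from itertools import product


def pairwise_distances(cluster_a, cluster_b):
    """All Euclidean distances between a point in cluster_a and one in cluster_b."""
    return [math.dist(p, q) for p, q in product(cluster_a, cluster_b)]


def single_link(cluster_a, cluster_b):
    """Distance between the two closest members of the clusters."""
    return min(pairwise_distances(cluster_a, cluster_b))


def complete_link(cluster_a, cluster_b):
    """Distance between the two farthest members of the clusters."""
    return max(pairwise_distances(cluster_a, cluster_b))


def average_link(cluster_a, cluster_b):
    """Mean distance over all cross-cluster pairs."""
    distances = pairwise_distances(cluster_a, cluster_b)
    return sum(distances) / len(distances)
```

For the clusters `[(0, 0), (0, 1)]` and `[(0, 3), (0, 5)]`, the single, complete and average link distances are 2.0, 5.0 and 3.5. At each step, HAC merges the pair of clusters with the smallest distance under the chosen rule, so the choice of rule changes which clusters are merged.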